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ABSTRACT 

In this paper, we examine the validity of non-parametric spatial bootstrap 
as a procedure to quantify errors in estimates of iV-point correlation functions. 
We do this by means of a small simulation study with simple point process 
models and estimating the two-point correlation functions and their errors. The 
coverage of confidence intervals obtained using bootstrap is compared with those 
obtained from assuming Poisson errors. The bootstrap procedure considered here 
is adapted for use with spatial (i.e. dependent) data. In particular, we describe a 
marked point bootstrap where, instead of resampling points or blocks of points, 
we resample marks assigned to the data points. These marks are numerical values 
that are based on the statistic of interest. We describe how the marks are defined 
for the two- and three-point correlation functions. By resampling marks, the 
bootstrap samples retain more of the dependence structure present in the data. 
Furthermore, this method of bootstrap can be performed much quicker than some 
other bootstrap methods for spatial data, making it a more practical method with 
large datasets. We find that with clustered point datasets, confidence intervals 
obtained using the marked point bootstrap has empirical coverage closer to the 
nominal level than the confidence intervals obtained using Poisson errors. The 
bootstrap errors were also found to be closer to the true errors for the clustered 
point datasets. 

Subject headings: methods: statistical 



Introduction 



In analyses of survey data, s uch as those of galax i es or quasars, iV-point correlation 



funct ions are often estimated (e.g. iKulkarni et al.l 120071 ; iMcCracken et al.l 120071 ; IShen et al. 



20071 ). These help to describe the structure of the observed objects, such as the filamentary 
structure of galaxies that has been observed, and to constrain the parameters of cosmological 
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models. Such estimates are only as important as their associated errors, since it is the 
errors that indicate the amount of agreement between two sets of data or between data and 
simulations from a model. 

In spatial point processes, the expressions for the standard errors of correlation (and 
similar ) func tions have been worked out only for the simplest of models. For example, 



Ripley! (119881 ) found approximations for the variance of the K function, an integral of the 
two-point correlation function, for the Poisson process. These depend on such factors as 
the shape of the observation region and the type of correction method for boundary effects. 



Landy fc Szalayl (119931 ). Hamilton! (119931 ) and others worked out approximations for standard 
errors of various estimators of the two-point correlation function under the Poisson or weakly 
clustered models. Thus, very often, approximations such as Poisson errors are used instead. 
However, point data arising in astronomy are typically clustered and non-Poisson. So while 
Poisson errors are useful and easy to compute, they only serve as rough indications of the 
size of errors. 

Besides using Poisson errors, errors can also be estimated by using mock catalogs gen - 



erated from a cosmological model. This method was employed in lEisenstein et al.l ( 120051 ). 
where the initial conditions for the cosmological model were selected independently. In the 
statistics literature, this is referred to as parametric bootstrap, although the term more com- 
monly refers to mock datasets generated from the model using parameter values fixed at the 
estimates from the data, instead of being independently chosen. 

An alternative is to use non-parametric bootstrap. This involves generating new sam- 
ples, called bootstrap samples, by resampling from the actual data, and computing estimates 
for these new samples. The distribution of these bootstrap estimates then serves as a proxy 
for the actual distribution of the data estimates, so that statistical inference, such as the 
construction of confidence intervals, can be performed. Note that the procedure does not 
make any specific model assumptions, thus the errors obtained by this method can serve as 
a check of model assumptions. Due to the simplicity and flexibility of the non-parametric 
bootstrap, the method is attractive. What is desirable then is to make the non-parametric 
bootstrap procedure work as well as possible for data that is correlated, and check that it 
performs satisfactorily, so that it can be useful as a tool in analysis. 

This paper thus examines the non-parametric bootstrap, specifically bootstrap of spa- 
tial data, where the dependence present in the data is of interest. There were some early 
misconceptions about how bootstrap should be applied to spatial data. The naive method 
of resampling individual points does not work in the spatial context. In order for spatial 
bootstrap to be valid, the underlying dependence structure has to be preserved as much as 
possible when generating bootstrap samples. Two common methods for doing this are the 
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block bootstrap and subsampling where blocks of data, instead of individual data points, are 
resampled. We introduce these methods in Sectio n [2] and describe th eir shortcomings. In 



Section [3] we describe the marked point bootstrap (lLoh &: Steinl 120041 ) as a way to address 
these shortcomings. We describe how the marked point bootstrap can be used with the 
two- and three-point correlation function estimators, and by extension to estimators of Ap- 
point correlation functions. In Section @] we present results of a simulation study using simple 
point process models comparing the empirical coverage of confidence intervals obtained using 
non-parametric bootstrap and using normal approximations with Poisson errors. 

In this paper, we restrict ourselves to constructing nominal 95% confidence intervals, 
i.e. these confidence intervals are supposed to contain the true value 95% of the time. The 
empirical coverage of the confidence intervals is the actual confidence level achieved by the 
confidence intervals. In a simulation study with a known model, the empirical coverage can 
be obtained by finding the number of confidence intervals that contain the true value and 
then compared with the nominal level. It is desirable, of course, for the empirical coverage 
to be close to the nominal level. Furthermore, it is often better for the empirical coverage 
to be higher instead of lower than the nominal level, so that the procedure is conservative. 

Bootstrap is a computationally intensive procedure. With the large datasets now com- 
mon in astronomy, even computing the iV-poin t correlation functions pose computational 



challenges. For example, lEisenstein et al.l (120051 ) avoided using the jackknife procedure for 



error estimation because of the size of the data they used. The way the marked point boot- 
strap is formulated, however, makes it much faster than subsampling (a generalization of 
the jackknife) and the block bootstrap, so that applying the procedure to large datasets is 
feasible as long as computing the actual estimates is feasible. In Section H] we provide some 
time measurements of the procedure used in our simulation study. 



2. Non-parametric bootstrap for spatial data 



The non-parametric bootstrap was originally developed for independent data (jEfron fc Tibshirani 



1994j ) . The main idea is to draw new samples from the actual data by sampling with re- 
placement a data point at a time. Bootstrap estimates of the same statistic are computed 
from the bootstrap samples. With these bootstrap estimates, confidence intervals, for ex- 
ample, can then be constructed. This can be done in a variety of ways. Suppose K, K and 
K*,i = 1,...B are respectively the quantity of interest, the estimate of K computed from 
the data and the bootstrap estimates, with B equal t o the number of bootstrap samples. A 
simple method, called the basic bootstrap interval in iDavison fc Hinkley (Il997h . is to set 



[2K ^(B+l)(l-a/2)' K*B+l)a/2l 
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as the 100(1 — a)% confidence interval for K. Here, B, the number of bootstrap samples, 
is large, say, 999, and K* is the w-th ordered values of the bootstrap statistic. So, for 
example, with B = 999 bootstrap samples, a 95% confidence interval for K is given by 
[2K — Kg 75 ,2K — Kzz]. In our simulation studies, we use (jTJ) to construct the confidence 
intervals. Standard errors for K are estimated by the standard deviation of the bootstrap 
estimates K*. 



Wh ile there are other method s of constructing confidence intervals from bootstrap sam- 
ples (see lDavison fc Hinkleyill997l . for example), the interest here is in the method of generat- 
ing the bootstrap samples, when the data is spatial. ISnethlagd (ll999l ) rightly concludes that 
resampling individual points do not work. If the resampled points are placed in their original 
positions in the observation region, there will be multiple points at single locations, which do 
not usually occur in most data sets. In their claim that b ootstrap cannot be used for analy- 
sis for clustering, ISimpson fc Mayer-Hasselwanderl (119861 ) were also considering bootstrap in 
terms of resampling individual points. 

Due to the success of bootstrap for resampling independent data, it has been extended 
to resample dependent data. Most of this work is for time series, but can easily be applied 
to spatial data in two and three dimensions. A common method is the block bootstrap: 



(Hall 


1985; 


Kiinsch 


1989; 


Liu & Singh 


1992) 



bootstrap involve limiting the range of dependence, increasing the observation region size 
and letting the resampling block size increase but at a slower rate than the observation 
region size. In this asymptotic setting, we then have many almos t independen t blocks of 
data, with each block itself containing a large subsample (see e.g. iLahirilbooj ). However, 



the assumed conditions necessary to make the calculations tractable also means that normal 
approximations work well too. Some theoretical results show that the accuracy of bootstrap 
estimates is of a higher order than the normal approximations. Whether this difference 
is meaningful in actual practice is less clear. We believe that the role of non-parametric 
bootstrap is to serve as another objective method to obtain standard errors that do not 
make any model assumptions. Error estimates obtained using bootstrap can be used as a 
way to assess or compare with other estimates of errors. 



Kemball fc Martinsekl (120051 ) is a recent work in the astronomy literature that examined 
bootst rap for dependent data. For the non-param etric bootstrap, they focused on subsam- 
pling (IPolitis fc Romanolll993l ; iPolitis et al.lll999l ). which can be considered a generalization 
of the jackknife procedure. In subsampling, random portions of the data are deleted, and the 
remaining data are treated as bootstrap samples. The standard deviation of the estimates 
computed from these samples serves as an estimate of the error, less a factor to adjust for 
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the smaller subsamples and the large overlap. 



When estimating correlation functions, pairs or triplets etc of points have to be counted. 
By joining independently resampled blocks together to form the bootstrap sample, the block 
bootstrap creates artifical configurations of points across the resampling blocks and distorts 
the dependence structure in the data. This does not matter in asymptotic arguments because 
the effect becomes negligibl e if the range o f the correlation is fixed while the resampling blocks 
increase in size. However, lLoh fc Stein! (120041 ) found that the actual coverage achieved by 
confidence intervals obtained using block bootstrap can be much lower than the nominal 
percentage level for finite samples. 

In subsampling, no artificial configurations of points are created. However, while the 
correction weight accounts for the difference in sample sizes between the bootstrap samples 
and the actual data set, it does not account for the change in the boundary effects due to 
the different resampling regions. Since subsampling uses smaller regions as the bootstrap 
observation regions, boundary effects are magnified. For subsampling, there is the temptation 
to use large subsamples to try and retain more of the dependence structure, but like block 
bootstrap, theoretical justification of the method r equire s that the subsamples be small in 
size relative to the actual data set. lLoh fc Stein! (12004! ) also found that subsampling can 
yield confidence intervals that attain very low empirical coverage. They also found that the 
subsampling method is sensitive to the fraction of the data used for subsampling. 



Loh &: Stein! (120041 ) proposed another version of spatial bootstrap, called marked point 



bootstrap, that reduces the effect of joining independent blocks and produces confidence 
intervals that achieve coverage closer to the nominal level. This is described in the next 
section, where we also show how it can be applied to the two- and three-point correlation 
function estimators commonly used in astronomy. 



3. Improving the non-parametric bootstrap of spatial data 

Suppose N points are observed in a region A. Furthermore, suppose that the quantity 
of interest K can be estimated using an estimator of the form 



N N 

1 . 

K 



1 N N 

-X;x)^.*i)=^ (2) 



N 

1=1 3=1 



Note that each point i has an associated quantity ^2f=x <f>(xi, Xj), the inner sum of equation 
(J2J). Estimators of two-point statistics can be expressed in this form. In this case, the quantity 
(j)(xi,Xj) will depend on the distance between Xi and Xj. As an aside, note that estimators 
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of three-point statistics can be written in a similar form, with the inner sum replaced by a 
double sum. 

With point data, the term "mark" is used to refer to some additional information 
associated with a point. This is usually some actual measured value. For galaxy data, for 
example, marks could be quantities such as luminosity, color and so on. In this paper, the 
bootstrap method considered uses marks associated with the points. However, these marks 
are not quantities such as luminosity that are directly measured. Instead they are numerical 
quantities that we construct and associate with the points. The actual values of these marks 
are not random, but are constructed so that they relate to the statistic that is of interest. 
If the statistic of interest is given by equation (j2J), then the mark associated with point 
i, denoted by m^, is equal to J2f=i j^i 4>{ x ii x j), so t na t <f> = J2i m i- At the risk of being 
repetitive, suppose that $ = DD{r) = J2 x eD J2 y eD- y ^x — v\ e (r — dr,r + dr)}, for some 
r, is of interest. This quantity is used in estimators of the two-point correlation function, 
and is the number of pairs of points separated by (roughly) distance r. Then the mark 
associated with point x is ^2 yeD . y ^ x — y\ £ (r — dr,r + dr)}, the number of points that 
are roughly distance r away from x. Note that the sum of all the marks gives back the value 
of DD(r). It is also important to note that to compute the estimate (121), the marks have to 
be calculated anyway. In regular applications, the algorithm doing the estimation does not 
individually record these marks, but keeps a running sum of the marks. In order to do the 
marked point bootstrap, the difference in terms of the code is that the marks now have to 
be stored so that they can be used in the bootstrap step. 

In the block bootstrap, blocks of data points are resampled and then joined together, 
forming a new dataset from which K is estimated using the new configuration of points that 
was generated, yielding K*. In the marked point bootstrap, blocks can be used to resample 
points as well. However, the crucial difference is that the bootstrap estimate is computed, 
not from how the resampled points are positioned, but from the marks that are associated 
with these points. In other words, the marked point bootstrap resamples the marks rather 
than the points and the bootstrap estimate is computed by summing these resampled marks. 

To be more precise, suppose that N* number of points have been resampled, with the 
resampled points denoted by x*, j = 1, . . . iV*. Associated with each x* is a mark rrij*. We 
denote this mark rrij* rather than m* to emphasize the fact that these marks are sampled 
from the actual data, i.e. computed from the original dataset and not from the bootstrap 
sample. Then the bootstrap estimate of K is given by the average of the resampled marks: 

N* 

K* = $*/N* = J^m^/N*, 

3=1 
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just like K is given by the average of the actual marks. Note that in an actual implementation 
of the procedure, all that is required is keeping track of how many times each point is 
resampled. The step-by-step procedure for estimating and resampling the quantity (j2j) is as 
follows: 



1. For each point i, calculate m ; = YLj=ij& ( f > ( x i> x 3)- 

2. Obtain the estimate K, using K = Y^i m i/N- 

3. Resample the points. This can be done by randomly placing blocks on to the observa- 
tion region and keeping track of which point is resampled. Suppose point i, % = 1, . . . , N 
is resampled n* times, and N* = J2i n i- 

4. The bootstrap estimate is then K* = x m^/N* 

5. Repeat steps 3 and 4 to get B bootstrap estimates. 

6. Construct a confidence interval using (pQ). 



A few remarks about the procedure are in order. 

Remark 1 Instead of randomly placing blocks, the observation region can be divided into 
a number of subregions, and the regions selected randomly with replacement. This latter 
method is sometimes referred to as using fixed blocks as opposed to moving blocks. It is 
generally considered that the moving blocks bootstrap works better in terms of convergence 
rates in asymptotic arguments. 

Remark 2 The number of blocks used is so that the total area/volume of the blocks is 
equal to the original area/volume of the observation region. Note that in this case N* would 
usually not be equal to N, though they will be of the same order of magnitude. However, 
this does not pose problems since the statistic K is a mean of the marks. 



Rema rk 3 There is no real consenus on the size of the resampling blocks to use. iBiihlmann fc Kiinsch 



(119991 ) did some work on determining the optimal block size from data. Intuitively, the pro- 



cedure needs large blocks so that the correlation structure is less distorted, and a large 
enough number of blocks so that there is enough variability between bootstrap samples. If 
K represents the number of blocks and N the number of data poi nts (which is assumed 



to increase with the observation region size), theoretical work in e.g. lLahiril (120031 ) suggests 
that consistency is achieved as K — > oo and N/K — > oo. Thus some trade-off is needed. A 
rule-of-thumb is to divide each dimension of the observation region into at least three parts, 
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i.e. nine blocks in 2D, 27 blocks in 3D. This would ensure enough variability between boot- 
strap samples. Of course, for correlation functions, the maximum value of the separation 
distance r at which these functions are estimated would influence the decision on block size. 



Fortunately, lLoh Sz Stein! (12004 ) found that the marked point bootstrap is less sensitive to 
block size than block bootstrap or subsampling: they resampled an absorber catalog using 
slices of the sphere and found that the bootstrap errors were similar for different sizes of the 
slices. Our simulation results also show little difference due to different block size. 

There are a few advantages to this form of spatial bootstrap over the regular block 
bootstrap. Since the bootstrap estimates are based on the resampled marks and not on 
marks recalculated from the bootstrap sample, the contribution to the bootstrap estimate is 
due to actual pairs of points in the original dataset. This helps to minimize the distortion 
of dependence structure in the dataset due to resampling. 

Furthermore, for any block of resampled points, information about the points just out- 
side the block (and therefore not sampled by this particular block) is captured by the marks 
associated with the points that are sampled by the block. This helps to reduce the variability 
of bootstrap results due to the size of the resampling blocks, compared to block bootstrap 
or subsampling. Also, since the resampling blocks do not need to be joined together to form 
a contiguous region for the bootstrap sample, there is flexibility in the choice of the shape 
of the resampling regions. 

Lastly, the marked point bootstrap can be performed relatively quickly compared to 
block bootstrap or subsampling. The marks that are associated with the points are part of 
the actual estimator and are already computed in the estimation step. Resampling using 
the marked point bootstrap only involves identifying which points are resampled with each 
resampling region, and keeping track of how many times each point is resampled. Inter- 
point distances and edge correction weights do not have to be recalculated. With N data 
points and B bootstrap samples, block bootstrap will take roughly BN 2 computations for a 
statistic involving pairs of points. The marked point bootstrap will involve roughly N 2 + BN 
computations. The difference will be more marked for three-point computations. 



Simulation studies done in lLoh &: Stein! (120041 ) showed that the empirical coverages of 



confidence intervals obtained using the marked point bootstrap can be much closer to the 
nominal 95% level than those obtained with block bootstrap or subsampling. 

We now describe how the marked point bootstrap can be used with estimators of the 
two-point correlation function. The common estimators of the two-point correlation function 
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£(r) are 



Iff am (r) 

£,Landy (?") 
£,Hewett(r) 



dd(r) 
rr(r) 
<ic?(r) 
dr(r) 

dc?(r) ■ rr(r) 



dr(r 


) 2 


dd(r) — 


2dr(r) 


rrl 


r) 


dd(r) — 


dr(r) 



1. 

+ 1, 



rr(r) 



which are, respe c tively, the natura l estim a tor (IKerscher et a" 



. J 1 . 

Davis & Peebles 


(1983): 


Hamilton 


(1993 





2000) 



(3) 
(4) 
(5) 
(6) 
(7) 

and estimators due to 



Hewettl (119821 ). where r 



is some distance of interest. In these expressions, dd(r) = DD(r)/N 2 , dr(r) = DR(r)/NNn 
and rr(r) = RR(r)/N^, where DD{r) = J2 xED J2 yeD:y ^ x 1{|# — y\ £ (r - dr,r + dr)}/N 2 , 



e Cr-dr,r + dr)}/iViVfl and iU2(r) =E x6Jl E w6 j^l{k- 
G (r — dr, r + dr)}/iV|,, R is a set of randomly generated points (i.e. Poisson) in the 

observation region A, and iV and Nr are respectively the number of points in the real and 

random data sets. 



DR(r) 

y\ 



To apply the marked point bootstrap, assign to each point x of the dataset marks 
m X: i = J2yeD: V ^x 1 {\ x -y\ e {r - dr } r + dr)} and m x>2 = J2 y eR ~ v\ G (r - dr,r + dr)}. 
Bootstrap proceeds by resampling blocks of points and recording the marks associated with 
them. For a bootstrap sample, x*,j = 1, . . . N*, we then have 



DD*(r) 



N* 
3=1 



N* 



DR*(r) = J2 m ^,2, 

3=1 



and bootstrap estimates of the two-point correlation functions are then obtained by substi- 
tuting the above into flSJ)-©- If each point Xj of the actual data is resampled n* times, so 
that N* = Yl,i n *ii DD*{r) and DR*(r) can also be written as 



A' 



N 



DD*(r) = J>* x m^i), DR*(r) = X m Xi>2 ). 



i=l 



Note that RR does not need to be resampled, since it is used as an approximation to 
an integral and has nothing to do with the actual data. If, as is usually the case, estimation 
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of £(r) is needed for a range of values of r, then the marks m X) i and m x ,2 would be vectors, 
containing the relevant values for each value of r. 

Estimators of the three-point correlation function can be bootstrapped in a similar way. 
For example, an estimator of the three-point correlation function is 

ddd — ddr 



c 



+ 2, 



introduced bv IPeebles fc Grothl fll975f >. where ddd = DDD /N 3 , ddr = DDR/N 2 N R and 
rrr = RRR/N^ and DDD,DDR,RRR are counts of triplets of points with the desired 
configuration, DDD with all points from the real data set and so on. The contribution to 
DDD by any particular triplet of points is divided by 3 and assigned as marks to each of the 
three points. For any individual point, all these marks are summed together. For DDR, the 
contribution by each triplet is divided by 2 and assigned to the two real data points. Boot- 
strap proceeds by resampling the real data points and the values of DDD* and DDR* found 
by adding the marks of the resampled points. Substituting these into (jHJ) gives the boot- 
strap estimate. Other similar estimat ors, such as the th r ee-poi nt estimator of iJing fc Borner 
(119981 ) or the iV-point estimators of ISzapudi fc Szalayl (119981 ). can be bootstrapped in the 
same way. 



4. Simulation study 



We performed a simple simulation study to compare the performance of confidence 
intervals obtained using the marked point bootstrap with those obtained using normal ap- 
proximations with Poisson errors, varying the observation region size, number density and 
point process model. For computational simplicity, we restrict to two dimensions. We also 
performed an additional study with a large observation region and approximately 50,000 
points, showing the applicability of the marked point bootstrap to datasets of size compa- 
rable to current astronomy datasets. 

We used the Poisson point process model and a Neyman-Scott model to generate the 
data poi nts. The Neyman-Scott model is of historical interest in astronomy as a model for 
galaxi es (INeyman fc Sc ott 19 521) . It is st ill commonly used to model point data in other 



fields (IDiggk 



2003 



Waagepetersenl 120071 ). We chose the Neyman-Scott model as it is a 



model for clustered data with closed-form expressions for the two-point correlation function. 
The Neyman-Scott point datasets that we used are generated as follows: parent points are 
distributed as a Poisson point process with intensity X p . A Poisson number with mean m 
of offspring points are then randomly scattered about each parent point. The collection of 
offspring points form the point process. We set the dispersion function of offspring points 
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about parent points to be a bivariate normal density centered at the parent point, with 
standard deviation a. T his specific Neyman -Scott model is sometimes referred to as the 



modified Thomas model (jStoyan et al.lll995l ). The two-point correlation function, £(r), is 



zero for the Poisson model, while 

^ (r) = 4^ 6XP {-^ 

for the modified Thomas model. Thus the point pattern from a modified Thomas model is 
more clustered if A p or o is smaller. The quantity a also controls the range of the correlation, 
with the range larger for larger values of a. We used several values for X p , m and a in our 
simulation study. 

For each point process model, we generated 500 realizations on the unit square. For each 
realization, we estimated £(r) for r = 0.01, . . . , 0.1. Bootstrap estimates were then produced 
from each realization and a nominal 95% confidence interval constructed. Thus for each point 
process model, we have 500 95% confidence intervals. We then checked the the empirical 
coverage, i.e. the proportion of these that contained the true value of with proportion 
closer to 95% being desirable. We also constructed 500 confidence intervals using the normal 
approximation with Poisson errors. The Poisson error e p is the inverse of th e pair counts for 



an un correlated data set of the same size as the actual data, as given by lLandy &: Szalay 



( 119931 ) . The 95% confidence intervals for £ based on the normal approximation are thus 
given by (£ ± 2e p ) . We then found the empirical coverage of these confidence intervals. We 
then repeated the procedure for the 2x2 and 4x4 squares. The results are summarized in 
Figures [TJ to [3j 

Figure [Tj shows plots of the empirical coverage of nominal 95% confi dence intervals o f 



the two-point correlation function for the Poisson process model, using the iHamiltonl (119931 ) 
estimator. Simulation results for the other estimators are similar and are not shown. The 
thick solid lines in the plots show the empirical coverage of confidence intervals obtained 
using normal approximation with Poisson errors. Note that Poisson errors are correct in this 
case and we find that the empirical coverage is close to 95% for all the point densities and 
observation region sizes considered. 

The thin lines represent the empirical coverage of confidence intervals obtained from the 
marked point bootstrap, with the different line types representing different resampling block 
sizes. These were squares of lengths 0.5, 0.33 and 0.25 for the 1 by 1 regions, of lengths 1, 
0.67, 0.5, 0.33 for the 2 by 2 regions and of lengths 2, 1, 0.67, 0.5 for the 4 by 4 regions 
(solid, dashed, dotted and dashed-dotted lines respectively for increasingly smaller blocks). 
The difference due to the block size used for resampling appear to be s mall. As mentioned, 



this was an advantage of the marked point bootstrap. lLoh fc Stein! (12004 ) found greater 
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variation of performance with block size for subsampling and block bootstrap. 

Compared with the Poisson empirical coverage, we find that at low densities and smaller 
observation region sizes (plots towards the upper left of Figured]), the bootstrap method 
does poorly. However, the empirical coverage of the bootstrap confidence intervals quickly 
increases towards 95% with increasing density (down the columns in Figured]) and/or ob- 
servation region size (across the rows in Figured]), i.e. with larger sample sizes. 
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Fig. 1. — Plots of the empirical coverage of nominal 95% confidence intervals of the two- 
point correlation function for the Poisson point process model. The estimator used is that of 



Hamilton! (119931 ) . Confidence intervals are obtained using normal approximation with Pois- 



son errors (thick solid line) and with the marked point bootstrap using different resampling 
block sizes. The block sizes were squares of lengths 0.5, 0.33 and 0.25 for the 1 by 1 regions, 
of lengths 1, 0.67, 0.5, 0.33 for the 2 by 2 regions and of lengths 2, 1, 0.67, 0.5 for the 4 by 
4 regions (solid, dashed, dotted and dashed-dotted lines respectively for increasingly smaller 
blocks). 
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Fig. 2. — Plots of the empirical coverage of nominal 95% confidence intervals of the two- 
point correlation function for the modi fied Thomas pro cess model for realizations in a 2 x 2 
square. The estimator used is that of iHamiltonl (I1993I ). Confidence intervals are obtained 
using normal approximation with Poisson errors (thick solid line) and with the marked point 
bootstrap using different resampling block sizes (see text). 
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Fig. 3. — Plots showing the true (solid), Poisson (dotted) and bootstrap (dashed) errors in 
estimates of £ for 500 sets of data simulated in a 2 by 2 square region using each of various 
point models. The true errors are obtained from the variability in the estimates of £ over 
the 500 data sets. For each data set, Poisson and bootstrap errors are computed. The errors 
shown in the plots are the average over the 500 data sets. 
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Fig. 4. — Plots showing sample realizations of the modified Thomas model corresponding 
to four different sets of parameter values, simulated on a 20 x 20 square. The degree of 
clustering is higher in the top row, while the range of clustering is larger in the right column. 
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Fig. 5. — Plots of the coverage (left) and errors (right) for the bootstrap (solid) and Poisson 
(dashed) methods, based on 100 simulated realizations from the Thomas model on the 20 x 20 
square. The thick solid lines in the plots on the right column represent the true errors. 



-18- 



The top left plot of Figure [3] shows the Poisson errors and bootstrap errors for the 
Poisson point process model simulated on the 2x2 square. The bootstrap errors shown in 
this figure are from resampling with 0.33 x 0.33 squares. Also shown in the plot are the true 
errors as obtained from the estimates from 500 realizations. Notice that both the Poisson 
and bootstrap errors are close to the true errors. 

Figure [2] shows similar plots for various modified Thomas models, each with number 
density 500. The general behavior with increasing observation region size for the Poisson 
model occurs here as well. Thus to reduce the number of plots, we only include plots for the 
2 by 2 observation regions, and show the relative performance of the Poisson and bootstrap 
confidence intervals. 

We find that when the point pattern is only weakly clustered (left plot, for the case 
X p = 100, m = 2.5 and a = 0.15), the Poisson confidence intervals had empirical coverage 
close to the nominal 95% level. However, as the other two plots in Figure [2] show, the 
empirical coverages of the Poisson confidence intervals become lower than 95% as the degree 
and/or range of clustering increases (i.e. with smaller a or X p ). On the other hand, the 
boostrap confidence intervals attain coverage much closer to 95% for all the cases shown, 
regardless of the degree of clustering. Plots of the Poisson and bootstrap error estimates are 
shown in Figure [3j Notice that the Poisson approximation underestimates the true error as 
the degree of clustering increases, while the bootstrap error estimates remain close to the 
true errors, even for the modified Thomas model. 

Thus we find that the performance of the Poisson confidence intervals is sensitive to the 
degree of clustering of the point pattern. If the point pattern is Poisson, or weakly clustered, 
the empirical coverage of Poisson confidence intervals is close to the nominal level, even 
with small sample sizes. However, performance quickly deteriorates with greater degree of 
clustering. On the other hand, the bootstrap confidence interval does not perform well with 
small sample sizes. With moderate sample sizes, however, the bootstrap method performs 
rather well, over a wide range in the degree of clustering. 

We performed an additional set of simulations using data sets of roughly 50,000 points 
in a 20 x 20 square and estimating £(r) for r = 0.01 to 2. Other than the restriction to 
2D, the data size and range of r is roughly of the scale found in current astronomy data. 
We used the modified Thomas model and chose four sets of parameter values, varying the 
degree and range of clustering but with the same number density. A sample realization from 
each of the four models is shown in Figure HI The models corresponding to the top row in 
Figure H] have higher clustering than the models on the bottom row. For models in the same 
row, the strength of clustering is similar, but the model on the right has a longer correlation 
range. We used square resampling blocks of side length 5, 2.5 and 2 and results were very 
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similar. 

The results are summarized in Figure 0, which show the empirical coverage of confidence 
intervals (left column) and errors (right column) obtained from the marked point bootstrap 
and with Poisson errors, for each of the four Thomas models. The plots qualitatively show 
the same relative performance between Poisson errors and bootstrap as found in the earlier 
simulation study. When the range of clustering is large, the empirical coverage of confidence 
intervals based on Poisson errors and the normal approximation is very low (second and 
fourth plots on the left column of Figure E]). The coverage of the bootstrap intervals are 
affected too, but by much less. When the correlation is large, the Poisson errors substantially 
under-estimate the true errors, while the marked bootstrap errors were more realistic. At 
the larger values of r, especially when £ is near 0, the bootstrap procedure appears to be 
somewhat conservative, while the Poisson errors become more accurate. 

We made some time measurements of various sections of the algorithm: the functions 
computing DD and DR took 1 minute and 13 minutes respectively. Here, Nr = 200, 000 and 
we did not use any sophisticated methods (such as tree-based algorithms) to speed up the 
computation. The bootstrap function, generating 999 samples and computing the estimates, 
took roughly 1 minute, showing the feasibility of the procedure for large data sets. The speed 
of the marked point bootstrap is due to the fact that the marks that are resampled have 
already been computed as part of the estimation. The additional computational burden of 
the bootstrap is sampling the points and keeping track of the number of times each point is 
resampled. 

5. Discussion 

In this paper, we introduced the marked point bootstrap as a method to bootstrap 
spatial data for estimating errors without specific model assumptions. In particular, we 
described how the method can be applied to estimators of the two- and three-point correlation 
functions. With the non-parametric bootstrap, errors are obtained from the actual data. 
There is no need choose a model, select parameter values or generate mock catalogs using 
iV-body simulations. Thus errors obtained from non-parametric bootstrap can be used to 
compare with errors obtained from other methods with more specific model assumptions. 

For non-parametric spatial bootstrap, we propose the marked point bootstrap over the 
more common block bootstrap or subsampling methods. There are several advantages of 
the marked point bootstrap. Firstly, by using information from actual pairs or triplets of 
points in the data, bootstrap confidence intervals using the marked point bootstrap attain 
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better empirical c overage than confidence intervals constructed using block bootstrap (see 



Loh &: Stein! 120041 for a comparison of these two methods). 



Secondly, in the marked point bootstrap, it is the marks that are used to compute the 
bootstrap estimate. Thus the resampled points do not have to be arranged to form a new 
point pattern. This makes it a lot easier to boot strap data that are observed in irregularly 



shaped regions that are common in astronomy. In lLoh &: Steinl (12004 ) for example, bootstrap 
on an absorber catalog was done using slices as well as spheres, with similar results for both 
types of resampling regions. 

Thirdly, the marks used for resampling are part of the original estimate and are com- 
puted during the estimation step. The only additional computation required by the marked 
point bootstrap involves selecting points (that is, testing whether each point lies in a re- 
sampling region or not), and keeping track of the number of times each point is resampled. 
Unlike the block bootstrap, there is no need to re-compute from scratch the estimates for 
each bootstrap sample. This difference in computation is even greater for higher-order statis- 
tics. These properties of the marked point bootstrap make it a computationally feasible tool 
for analysis. 

Our study here suggests that non-parametric bootstrap can yield valid estimates of er- 
rors under a wide range of point patterns. The lack of specific model assumptions means that 
the non-parametric bootstrap method, and in particular the marked point bootstrap, can 
serve as an alternate and complementary method for quantifying errors. Having estimates 
of errors obtained using Poisson approximations, parametric and non-parametric bootstrap 
allows one to have a better sense of the size of errors involved in an analysis. 

The simulation study performed here shows that bootstrap confidence intervals do attain 
coverage close to the nominal level, even for the clustered point patterns where Poisson 
errors are known to be inaccurate, when sample sizes are large. More specifically, bootstrap 
performance improves with increasing number density, and also with increasing observation 
region size relative to the correlation length. Unfortunately, in astronomy, the correlation 
length may be of the same scale as the observation region. If the values of r at which the 
correlation function estimates are computed are small relative to the resampling blocks (and 
the observation region), then although the bootstrap procedure would distort the dependence 
structure at the large scales, it would still be valid for these smaller values of r. 

If, instead, £(r), say, for r close to the size of the observation region is of interest, then 
the bootstrap procedure would start to break down, in the sense that the empirical coverage 
of confidence may not be close to the nominal level, and the bootstrap errors not reflect the 
true errors. In this case, the amount of information contained in the data is smaller and the 
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boundary effects are magnified. With respect to the marked point bootstrap, larger blocks 
are needed to capture the dependence structure at this larger scale. For a fixed sample, this 
cannot be done without reducing the variability of the bootstrap samples. The method that 
might work best is parametric bootstrap, assuming that the model is correct, and that the 
parameter values used are close to the true values. Non-parametric bootstrap can still be 
useful here. Firstly, it is at least a better choice than Poisson errors, since the latter would 
grossly underestimate the true errors. Secondly, it can provide additional error estimates 
to compare with the errors obtained with the assumed model. For these most challenging 
instances, having a variety of methods can only be beneficial. 

This research is supported in part by National Science Foundation award AST-0507687. 
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